181 research outputs found

Fibonacci Binning

Author: Vigna Sebastiano
Publication date: 22/02/2014

This note argues that when dot-plotting distributions typically found in papers about web and social networks (degree distributions, component-size distributions, etc.), and more generally distributions that have high variability in their tail, an exponentially binned version should always be plotted as well. It suggests Fibonacci binning as a visually appealing, easy-to-use, and practical choice.
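A minimal sketch of what exponential binning with Fibonacci-sized bins could look like. The abstract does not spell out the details, so everything concrete here is an assumption: the bin boundaries 1, 2, 3, 5, 8, ..., the half-open bins, the choice to report the mean count per integer in each bin, and the hypothetical function names.

```python
def fibonacci_bin_edges(max_value):
    """Return bin boundaries 1, 2, 3, 5, 8, ... ending with the first one > max_value."""
    edges = [1, 2]
    while edges[-1] <= max_value:
        edges.append(edges[-1] + edges[-2])
    return edges


def fibonacci_binning(freq):
    """Bin a frequency table {positive integer value: count}.

    Returns a list of (left, right_exclusive, mean_count) tuples, where
    mean_count is the total count in the bin divided by the bin width
    (an assumed normalization, chosen so that bins of different widths
    are comparable on the same plot).
    """
    if not freq:
        return []
    if min(freq) < 1:
        raise ValueError("values must be positive integers")
    edges = fibonacci_bin_edges(max(freq))
    totals = [0] * (len(edges) - 1)
    k = 0
    for value in sorted(freq):
        # Advance to the bin [edges[k], edges[k + 1]) that contains value.
        while value >= edges[k + 1]:
            k += 1
        totals[k] += freq[value]
    return [
        (edges[i], edges[i + 1], totals[i] / (edges[i + 1] - edges[i]))
        for i in range(len(totals))
    ]
```

Plotting the third component of each tuple against a point inside the bin, on log-log axes, alongside the raw dot plot, is one way to follow the recommendation in the abstract.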

arXiv.org e-Print Archive

CiteSeerX

Broadword Implementation of Parenthesis Queries

Author: Vigna Sebastiano
Publication date: 24/01/2013

We continue the line of research started in "Broadword Implementation of Rank/Select Queries", proposing broadword (a.k.a. SWAR, "SIMD Within A Register") algorithms for finding matching closed parentheses and the k-th far closed parenthesis. Our algorithms run in time O(log w) on a word of w bits, and contain no branch and no test instructions. On 64-bit (and wider) architectures, these algorithms make it possible to avoid costly tabulations, while providing a very significant speedup with respect to for-loop implementations.
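To illustrate the two styles the abstract contrasts, here is a hedged sketch. It is not the paper's parenthesis algorithm: `popcount64` is the classic branch-free sideways addition, shown only as a familiar example of broadword code running in O(log w) steps, and `find_close_loop` is a for-loop baseline of the kind the abstract compares against. The bit convention (bit set = open parenthesis, position 0 = least significant bit) and both function names are assumptions.

```python
MASK64 = (1 << 64) - 1


def popcount64(x):
    """Branch-free SWAR population count of a 64-bit word (sideways addition)."""
    x &= MASK64
    x -= (x >> 1) & 0x5555555555555555                               # 2-bit sums
    x = (x & 0x3333333333333333) + ((x >> 2) & 0x3333333333333333)   # 4-bit sums
    x = (x + (x >> 4)) & 0x0F0F0F0F0F0F0F0F                          # 8-bit sums
    # Multiplying by 0x0101...01 accumulates all byte sums in the top byte.
    return ((x * 0x0101010101010101) & MASK64) >> 56


def find_close_loop(word, width=64):
    """Loop baseline: position of the parenthesis closing the one at bit 0.

    Assumed convention: bit i set means '(' at position i, clear means ')'.
    Returns -1 if the parenthesis at position 0 is not closed within `width`.
    """
    excess = 0
    for i in range(width):
        excess += 1 if (word >> i) & 1 else -1
        if excess == 0:
            return i
    return -1
```

For example, the string "(()())" read from position 0 is encoded as `0b001011`, and `find_close_loop(0b001011, 6)` scans all six positions before the excess returns to zero.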

arXiv.org e-Print Archive

CiteSeerX

Supremum-Norm Convergence for Step-Asynchronous Successive Overrelaxation on M-matrices

Author: Vigna Sebastiano
Publication date: 12/04/2014

Step-asynchronous successive overrelaxation updates the values contained in a single vector using the usual Gauss-Seidel-like weighted rule, but arbitrarily mixing old and new values, the only constraint being temporal coherence: you cannot use a value before it has been computed. We show that given a nonnegative real matrix $A$, a $\sigma\geq\rho(A)$ and a vector $\boldsymbol w>0$ such that $A\boldsymbol w\leq\sigma\boldsymbol w$, every iteration of step-asynchronous successive overrelaxation for the problem $(sI-A)\boldsymbol x=\boldsymbol b$, with $s>\sigma$, reduces geometrically the $\boldsymbol w$-norm of the current error by a factor that we can compute explicitly. Then, we show that given a $\sigma>\rho(A)$ it is in principle always possible to compute such a $\boldsymbol w$. This property makes it possible to estimate the supremum norm of the absolute error at each iteration without any additional hypothesis on $A$, even when $A$ is so large that computing the product $A\boldsymbol x$ is feasible, but estimating the supremum norm of $(sI-A)^{-1}$ is not.
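A small sketch of the setting, under stated assumptions. It runs plain in-place (Gauss-Seidel-ordered) successive overrelaxation on a system of the form (sI - A)x = b and tracks the error in a weighted supremum norm, taken here to be max_i |x_i| / w_i. The sequential sweep is only one particular update schedule, the norm definition is an assumption about what the abstract calls the w-norm, and the explicit contraction factor from the paper is not reproduced. All function names are hypothetical.

```python
def sor_sweep(A, b, x, s, omega=1.0):
    """One in-place SOR sweep for (s*I - A) x = b, with A given as a list of rows."""
    n = len(b)
    for i in range(n):
        # Off-diagonal entries of s*I - A are -A[i][j], so they move to the
        # right-hand side with a plus sign.
        off_diag = sum(A[i][j] * x[j] for j in range(n) if j != i)
        gs_value = (b[i] + off_diag) / (s - A[i][i])
        x[i] = (1.0 - omega) * x[i] + omega * gs_value
    return x


def weighted_sup_norm(v, w):
    """Weighted supremum norm max_i |v_i| / w_i for a positive vector w."""
    return max(abs(vi) / wi for vi, wi in zip(v, w))


def error_history(A, b, x_star, w, s, sweeps, omega=1.0):
    """Weighted-norm error after each sweep, starting from the zero vector."""
    x = [0.0] * len(b)
    errors = []
    for _ in range(sweeps):
        sor_sweep(A, b, x, s, omega)
        errors.append(weighted_sup_norm([xi - ti for xi, ti in zip(x, x_star)], w))
    return errors


# Toy instance: A is nonnegative with spectral radius 1, w = (1, 1) satisfies
# A w <= sigma w with sigma = 1, and s = 2 > sigma. The solution is (1, 1).
A = [[0.0, 1.0], [1.0, 0.0]]
errors = error_history(A, b=[1.0, 1.0], x_star=[1.0, 1.0], w=[1.0, 1.0], s=2.0, sweeps=10)
```

On this toy instance the recorded errors shrink at every sweep, which is the qualitative behavior the abstract describes.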

arXiv.org e-Print Archive

CiteSeerX

Stanford Matrix Considered Harmful

Author: Vigna Sebastiano
Publication date: 01/01/2007

This note argues about the validity of web-graph data used in the literature.

arXiv.org e-Print Archive

Dagstuhl Research Online Publication Server

An experimental exploration of Marsaglia's xorshift generators, scrambled

Author: Vigna Sebastiano
Publication date: 01/06/2016

Marsaglia recently proposed xorshift generators as a class of very fast, good-quality pseudorandom number generators. Subsequent analysis by Panneton and L'Ecuyer has lowered the expectations raised by Marsaglia's paper, showing several weaknesses of such generators, verified experimentally using the TestU01 suite. Nonetheless, many of the weaknesses of xorshift generators fade away if their result is scrambled by a non-linear operation (as originally suggested by Marsaglia). In this paper we explore the space of possible generators obtained by multiplying the result of a xorshift generator by a suitable constant. We sample generators at 100 equispaced points of their state space and obtain detailed statistics that lead us to choices of parameters that improve on the current ones. We then explore for the first time the space of high-dimensional xorshift generators, following another suggestion in Marsaglia's paper, finding choices of parameters providing periods of length $2^{1024} - 1$ and $2^{4096} - 1$. The resulting generators are of extremely high quality, faster than current similar alternatives, and generate long-period sequences passing strong statistical tests using only eight logical operations, one addition and one multiplication by a constant.
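A minimal sketch of the scrambling idea: a 64-bit xorshift step followed by multiplication of the result by a constant. The shift triple (12, 25, 27) and the multiplier are the values commonly circulated for the generator known as xorshift64*; they are not stated in the abstract and should be treated as assumptions here, as should the class name. The high-dimensional generators with periods $2^{1024} - 1$ and $2^{4096} - 1$ are not shown.

```python
MASK64 = (1 << 64) - 1

# Assumed parameters (commonly quoted for xorshift64*, not taken from the abstract).
SHIFT_A, SHIFT_B, SHIFT_C = 12, 25, 27
MULTIPLIER = 2685821657736338717


class XorShift64Star:
    """64-bit xorshift generator whose output is scrambled by a multiplication."""

    def __init__(self, seed):
        seed &= MASK64
        if seed == 0:
            # The all-zero state is a fixed point of every xorshift step.
            raise ValueError("seed must be nonzero")
        self.state = seed

    def next(self):
        x = self.state
        x ^= x >> SHIFT_A
        x ^= (x << SHIFT_B) & MASK64   # emulate 64-bit wraparound
        x ^= x >> SHIFT_C
        self.state = x
        # The linear xorshift state is kept; only the returned value is scrambled.
        return (x * MULTIPLIER) & MASK64
```

Python integers are unbounded, so the explicit masking stands in for the natural overflow of a 64-bit unsigned type in C or Java.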

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

Efficient Optimally Lazy Algorithms for Minimal-Interval Semantics

Authors: Boldi Paolo, Vigna Sebastiano
Publication date: 11/08/2016

Minimal-interval semantics associates with each query over a document a set of intervals, called witnesses, that are incomparable with respect to inclusion (i.e., they form an antichain): witnesses define the minimal regions of the document satisfying the query. Minimal-interval semantics makes it easy to define and compute several sophisticated proximity operators, provides snippets for user presentation, and can be used to rank documents. In this paper we provide algorithms for computing conjunction and disjunction that are linear in the number of intervals and logarithmic in the number of operands; for additional operators, such as ordered conjunction and Brouwerian difference, we provide linear algorithms. In all cases, space is linear in the number of operands. More importantly, we define a formal notion of optimal laziness, and either prove it, or prove its impossibility, for each algorithm. We cast our results in a general framework of antichains of intervals on total orders, making our algorithms directly applicable to other domains.

Comment: 24 pages, 4 figures. A preliminary (now outdated) version was presented at SPIRE 200
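The antichain condition can be made concrete with a short sketch. Given an arbitrary collection of closed integer intervals, the function below keeps only those that contain no other interval of the collection, so the survivors are pairwise incomparable with respect to inclusion. This is a plain sort-and-sweep filter written for illustration; it is not one of the paper's lazy operator algorithms, and the function name and tuple representation are assumptions.

```python
def minimal_intervals(intervals):
    """Return the inclusion-minimal intervals of a collection of (left, right) pairs.

    Intervals are closed and satisfy left <= right. The result is an antichain
    sorted by right endpoint (and therefore also by left endpoint).
    """
    # Sort by right endpoint ascending, ties broken by left endpoint descending,
    # so that any interval contained in the current one has already been seen.
    ordered = sorted(set(intervals), key=lambda iv: (iv[1], -iv[0]))
    result = []
    max_left = float("-inf")
    for left, right in ordered:
        # An earlier interval has right' <= right; it lies inside the current
        # interval exactly when its left endpoint is >= left.
        if max_left < left:
            result.append((left, right))
        max_left = max(max_left, left)
    return result
```

For example, among (1, 5), (2, 3), (2, 6), (4, 7) and (6, 7), only (2, 3) and (6, 7) contain no other interval, so those two form the antichain of witnesses.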

arXiv.org e-Print Archive

AIR Universita degli studi di Milano

Four Degrees of Separation, Really

Authors: Boldi Paolo, Vigna Sebastiano
Publication date: 01/01/2012

We recently measured the average distance of users in the Facebook graph, spurring comments in the scientific community as well as in the general press ("Four Degrees of Separation"). A number of interesting criticisms have been made about the meaningfulness, methods and consequences of the experiment we performed. In this paper we discuss some methodological aspects that we deem important to underline, in the form of answers to the questions we have read in newspapers, magazines, blogs, or heard from colleagues. We indulge in some reflections on the actual meaning of "average distance" and make a number of side observations showing that, yes, 3.74 "degrees of separation" are really few.
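As a point of reference for the term "average distance", here is a hedged sketch that computes it exactly on a small graph by running a breadth-first search from every node and averaging over reachable ordered pairs of distinct nodes. Restricting the average to reachable pairs is an assumption, as are the function name and the adjacency-dictionary input. Exhaustive search like this is only practical on small graphs and is not a description of how the measurement in the abstract was carried out.

```python
from collections import deque


def average_distance(adj):
    """Mean shortest-path length over reachable ordered pairs of distinct nodes.

    `adj` maps each node to an iterable of its neighbors.
    """
    total = 0
    pairs = 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1   # exclude the source itself
    return total / pairs if pairs else float("nan")


# Path graph 0 - 1 - 2 - 3, given as an undirected adjacency dictionary.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
avg = average_distance(path)
```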

arXiv.org e-Print Archive

CiteSeerX

AIR Universita degli studi di Milano